INTRODUCTION


The Joint Oil Analysis Program (JOAP) coordinates the Department of
Defense (DOD) programs employing spectrometric analyses of used oils for
condition monitoring of many types of equipment. Two hundred-odd oil
analysis laboratories provide these spectrometric analyses; the great majority
are owned by the individual services, with the remainder being contract
facilities (non-DOD). Because of the mobility of equipment, it is quite possible
that successive samples of used oil from the same piece of equipment may not
be analyzed on the same instrument. For this reason, and numerous others, it is
highly desirable that the same oil sample, when analyzed by different
laboratories, should result as nearly as possible in the same contaminant readings.


In the mid-1970s JOAP instituted its "correlation" program, intended
to provide information regarding the consistency of readings produced by the
spectrometric instruments serving its needs; this program was expected to
monitor both the internal consistency of repeated readings by the same
instrument and the consistency from one instrument (or laboratory) to another.
The landmark paper discussing this type of problem is by Youden [3], which
highlights some empirical observations about instrument-to-instrument testing
in general. The procedure Youden describes for checking laboratory-to-laboratory
consistency consists of sending each participating laboratory two "similar"
samples of unknown composition; each laboratory receives the same two samples.
Each is required to analyze both of the samples (one time) and return the pair
of results to a central processing location. If one defines

$x_i$ = analysis result for sample 1, laboratory $i$,

$y_i$ = analysis result for sample 2, laboratory $i$,

then the n pairs $(x_i, y_i)$, $i = 1, 2, \ldots, n$, can be represented as n points
in a plane. If one plots these n points, Youden pointed out that the resulting
swarm of points almost invariably has the general shape depicted in Figure 1.

With coordinate axes at the medians (or means) of the $x_i$ and $y_i$ values (as
in Figure 2), the preponderance of points will typically fall in the first and
third quadrants, with relatively few in the second and fourth. This would
necessarily follow in a situation in which a laboratory tends to get either high
readings or low readings for both of the two samples. If we were to draw in a
45° line and project the points onto this line, the resulting scatter of these n
projected points describes laboratory-to-laboratory variation. This variation
Youden attributed to differences in laboratory technique (it could equally well
be called the variation in accuracy of the laboratories). Youden also pointed
out that one can measure the perpendicular distance of each point from this 45°
line (i.e., also project the original points onto the normal to the 45° line)
to measure the "precision" (or repeatability) of a given laboratory; scatter in
this direction should be mainly due to the ability of an individual laboratory
to reproduce its own results. Youden suggested that limits defining acceptable
laboratory performance can be constructed from the scatter or variation observed
in these two directions.

THE CURRENT CORRELATION PROGRAM

The JOAP correlation program was modelled after the interlaboratory type
of comparison described by Youden, with some important modifications; the basic
computations used in the JOAP correlation program are described in [1]. This
program is administered by the JOAP Technical Support Center (TSC), located at
the Naval Air Rework Facility, Pensacola, FL. Briefly, the correlation program
works as follows. Each month, each JOAP laboratory is sent the same pair of oil
samples (actually 2 pairs of samples are used, as described later); the
particular concentrations of the elements of interest in these samples vary from
month to month and are not known by the participating laboratories. Each
laboratory analyzes the pair of samples it receives (presumably only once) and
mails the results back to the TSC. Again, let $(x_i, y_i)$ represent the two
sample readings from laboratory i, for a given element. The procedure described
in [1] first determines a "trimmed" mean value for the x's and for the y's,
independently. These trimmed means are computed by arranging the given x analyses,
say, in order of magnitude, deleting the lowest 20% and the highest 20%, and
then averaging the remaining middle 60%. Note that it is quite possible that
the x score from laboratory 1 might be trimmed off while its y score is not;
that is, a given laboratory's results may contribute to one trimmed mean and not
the other. Note as well that only 60% of the x scores received, and 60% of the
y scores, are used to define these trimmed means. Letting $x_T$ and $y_T$
represent these trimmed means (for a given element and month), the JOAP correlation
procedure locates a new coordinate system at $(x_T, y_T)$; these trimmed means
play the role of the medians in Youden's discussion [3].
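The trimming step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trimmed_mean` is hypothetical, and the rule for how many readings to drop when 20% of n is not a whole number (here, rounded down) is an assumption, since the text does not specify it.

```python
def trimmed_mean(values, trim_fraction=0.20):
    """Average the readings left after dropping the lowest and highest trim_fraction."""
    ordered = sorted(values)
    # Assumption: the count dropped from each end is rounded down.
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical readings for one element from ten laboratories.
x_readings = [10, 11, 9, 10, 12, 30, 10, 11, 2, 10]
y_readings = [11, 12, 10, 11, 12, 12, 11, 12, 10, 11]

x_t = trimmed_mean(x_readings)  # drops 2 and 9 (low), 12 and 30 (high)
y_t = trimmed_mean(y_readings)
print(x_t, y_t)
```

Because the two samples are trimmed independently, the laboratory that reported 30 for sample 1 is excluded from the x mean, while its sample 2 reading of 12 may or may not be excluded from the y mean depending on ties in the ordering.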

Rather than constructing a 45° line, as suggested by Youden, reference [1]
uses a line of slope S, where S is determined by the trimmed means $(x_T, y_T)$ and
constants A, B which differ from element to element and are presented in Table 1.


Table 1. Constants used to determine slope

Element     A      B
Fe         2.0    .1
Ag         1.5    .1
Al         2.0    .1
Cr         1.5    .1
Cu         1.5    .1
Mg         1.5    .1
Si         1.9    .14
Ti         1.5    .1
Ni         1.5    .1

The slope S used for a given element is defined by

$S = (A^2 + B^2 y_T^2)^{.5} / (A^2 + B^2 x_T^2)^{.5}$.

If $x_T = y_T$, this formula gives S = 1 (an angle of 45°); indeed, if the two samples
sent to a given laboratory have essentially equal concentrations of a given element
(the guidelines call for the two to differ by no more than 15%), then this
computed slope will not differ from 1 by a great deal. Reference [1] says this
formula is intended to avoid "an error unless the composition of the material
being measured is identical in the two samples". It is not clear what this
expected error might have been, nor does it appear that the slope used by the
procedure will materially differ from the slope of 1 that Youden suggested. The
pair of readings $(x_i, y_i)$ is then projected onto the line with slope S (giving the
accuracy score for the laboratory) and onto the line which is normal to this
line with slope S (giving the repeatability score for the laboratory).
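A short Python sketch may help make the slope and the two projections concrete. It assumes that the line of slope S passes through the trimmed means $(x_T, y_T)$ and that projections are taken with unit direction vectors; the sign convention for the normal direction and the function names are assumptions, not taken from [1].

```python
import math


def slope(a, b, x_t, y_t):
    """S = sqrt(A^2 + B^2 y_T^2) / sqrt(A^2 + B^2 x_T^2)."""
    return math.sqrt(a**2 + (b * y_t) ** 2) / math.sqrt(a**2 + (b * x_t) ** 2)


def project(x, y, x_t, y_t, s):
    """Return (accuracy score, repeatability score) for one pair of readings."""
    dx, dy = x - x_t, y - y_t
    norm = math.sqrt(1.0 + s * s)
    accuracy = (dx + s * dy) / norm        # component along the line of slope S
    repeatability = (dy - s * dx) / norm   # component along the normal (sign assumed)
    return accuracy, repeatability


# Fe constants from Table 1; the trimmed means are hypothetical and differ by 15%.
s_equal = slope(2.0, 0.1, 10.0, 10.0)
s_15pct = slope(2.0, 0.1, 10.0, 11.5)
print(s_equal, s_15pct)
print(project(12.0, 12.0, 10.0, 10.0, s_equal))
```

With these hypothetical Fe values, a 15% difference between the two trimmed means moves the slope only about 3% away from 1, in line with the remark that the computed slope will not differ from 1 by a great deal.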

The major way in which the computations for the JOAP correlation program
differ from the procedure suggested by Youden is in the manner in which
accuracy (laboratory variation) and repeatability (variability of repeated
analyses by the same laboratory with the same sample) are assessed. Reference
[1] mentions a current (1973) laboratory certification program which was
designed to assure that each laboratory could meet minimum standard performance
criteria. This certification program calls for the laboratory to conduct a
sequence of ten separate analyses of a prepared oil standard with known
concentration c, say, of a given element. The accuracy index (AI) of the laboratory for
this element is the magnitude of the difference between the known concentration,
c, and the average of the laboratory's 10 analyses; the repeatability index (RI)
of the laboratory for this element is the sample standard deviation of the 10
analyses, computed in the usual way. The acceptable limit for AI is

$M = (A^2 + B^2 c^2)^{.5}$,

where the A and B values are those given in Table 1 above for the specific
element. The laboratory passes the accuracy certification for this element so
long as AI < M; it passes the repeatability certification so long as RI < M/2.
Thus the constants given in Table 1 were initially proposed for this
certification program, and were undoubtedly derived from some physical model of the way
in which a particular type of instrument should behave, based on ten repeated
analyses of the same sample.
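The certification check can be sketched as follows. This is a minimal illustration: the function name and the ten readings are hypothetical, and "computed in the usual way" is assumed to mean the sample standard deviation with divisor n - 1.

```python
import math
import statistics


def certification_check(readings, c, a, b):
    """Compare AI and RI for one element against the limits M and M/2."""
    ai = abs(c - statistics.mean(readings))   # accuracy index
    ri = statistics.stdev(readings)           # repeatability index (n - 1 divisor assumed)
    m = math.sqrt(a**2 + (b * c) ** 2)        # acceptable limit M
    return {
        "AI": ai,
        "RI": ri,
        "M": m,
        "passes_accuracy": ai < m,
        "passes_repeatability": ri < m / 2,
    }


# Hypothetical: ten Fe analyses of a standard with known concentration c = 50.
fe_readings = [48, 51, 50, 52, 49, 50, 53, 47, 50, 51]
result = certification_check(fe_readings, c=50.0, a=2.0, b=0.1)
print(result)
```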

In the correlation program, the accuracy criterion for a given element
is defined to be

$a = (2A^2 + B^2(x_T^2 + y_T^2))^{.5}$

and the repeatability criterion is a/2, where the constants A and B again come
from Table 1 above. Note that a is in fact the square root of the sum of the
squares of the M values for the two samples, with the trimmed means $x_T$, $y_T$
playing the roles of the known concentrations c. It is curious that these same
constants should be used in the correlation program, where each of two different
samples is to be analyzed one time, not ten, and presumably any type of
instrument might be used. Each laboratory then is judged on its accuracy and
repeatability performance for each element (each month). If the magnitude of its
accuracy score exceeds a, it fails on accuracy, and if the magnitude of its
repeatability score exceeds a/2, it fails on repeatability. This way of defining
acceptable limits for the two types of scores depends only on the trimmed means (and
the constants A and B) and in no way on the actual scatter of the observed data
themselves, contrary to Youden's suggestion. It also leads to quite erratic
behavior, in a certain sense, which will be explored below.
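The limit a and the pass/fail rule can be written out as a small Python sketch; the function names are hypothetical. The example also checks numerically that a squared equals the sum of the squared M values for the two samples, as noted above.

```python
import math


def accuracy_limit(a_const, b_const, x_t, y_t):
    """a = sqrt(2 A^2 + B^2 (x_T^2 + y_T^2)); the repeatability limit is a / 2."""
    return math.sqrt(2 * a_const**2 + b_const**2 * (x_t**2 + y_t**2))


def judge(accuracy_score, repeatability_score, limit):
    """Apply the correlation-program rule to one laboratory and one element."""
    return {
        "fails_accuracy": abs(accuracy_score) > limit,
        "fails_repeatability": abs(repeatability_score) > limit / 2,
    }


# Ag constants from Table 1 with trimmed means of zero (hypothetical choice).
limit_ag = accuracy_limit(1.5, 0.1, 0.0, 0.0)
print(round(limit_ag, 4))

# a^2 is the sum of the two squared M values (Fe constants, hypothetical means).
m_x = math.sqrt(2.0**2 + (0.1 * 10.0) ** 2)
m_y = math.sqrt(2.0**2 + (0.1 * 11.0) ** 2)
limit_fe = accuracy_limit(2.0, 0.1, 10.0, 11.0)
print(abs(limit_fe**2 - (m_x**2 + m_y**2)) < 1e-9)
```

Note that nothing in either function looks at the spread of the scores across laboratories, which is the point made in the paragraph above.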

It was mentioned earlier that the correlation program actually sends two
pairs of samples to each laboratory each month. One pair of samples is prepared
by the TSC in new oil, using organo-metallic concentrates with added sulfonate;
it is possible to control the contamination levels of all elements of interest
fairly well with these samples. This pair of samples is referred to as
"synthesized" samples. In addition to the pair of synthesized samples, the TSC also
sends each laboratory a pair of used engine oil samples. These are made from
used contaminated oils and, as such, should behave more like the actual oil samples
the laboratories are expected to analyze daily. It is much more difficult for
the TSC to exert control over the contaminant levels in these samples;
frequently the same powdered metallic contaminants used for the synthesized samples
are added to the used oil samples to adjust the contaminant levels. Thus the
correlation program monitors the laboratory performances on both types of
samples.


A second dichotomy exists in the correlation program, defined by the
physical principle employed by the instrument in measuring concentration.
Roughly 80% of the instruments used in JOAP are atomic emission (AE)
spectrometers. In these instruments the sample material (the oil) is excited by an
electric spark and the spectral lines of the light emitted are used to measure
concentrations. The remaining 20% of the instruments used are atomic
absorption (AA) spectrometers. In these instruments the sample material is excited
by a gas flame while illuminated by a light of known composition; the amount
of the known light absorbed, at specific spectral lines, is used to determine
the concentrations in the sample. Because of these different physical bases
for measurement, it is well known that the resulting concentration scales are
not identical. The correlation program computations are carried out separately
for these two types of instrument. Thus a typical JOAP correlation program
report contains two major partitions: one describing the behavior of the AE
instruments and the other describing the behavior of the AA instruments. Within
each of these two, the behaviors for synthesized oil samples and for used oil
samples are examined separately, computing the trimmed means, projecting the
readings onto "accuracy" and "repeatability" axes, etc., for each element of
interest. Although not mentioned in [1], it is undoubtedly true that the
constants A and B in Table 1, used in defining the accuracy limit a, are derived
from a theoretical model of the behavior of a particular atomic emission
instrument; nevertheless, the same constants are employed with the AA instruments. At
the present time, the same 9 elements are monitored for both types of instrument:
iron (Fe), silver (Ag), aluminum (Al), chromium (Cr), copper (Cu), magnesium (Mg),
silicon (Si), titanium (Ti), and nickel (Ni).



The correlation program summarizes the monthly behavior of each
participating JOAP instrument by a single score, combining behavior over the synthesized
and used oil samples. This score is arrived at by subtracting from 100 a certain
number of points for each element that the instrument fails to pass (because of
its accuracy result or its repeatability result or both), for each sample type,
for each month. Table 2 presents the number of points lost for each element.


Table 2. Number of points lost for failing accuracy
and/or repeatability, either sample type.

Element    Fe   Ag   Al   Cr   Cu   Mg   Si   Ti   Ni
Points      9    6    4    4    9    4    4    6    4

Thus, if laboratory 1, say, had failed accuracy for Fe and both accuracy and
repeatability for Cr with synthetic samples, and failed only Cr with used
samples, its monthly score would be 100 - 9 - 4 - 4 = 83. If laboratory 2
failed accuracy for Si and Ti with synthetic samples, and both accuracy and
repeatability for Cu with used samples, its score would be 100 - 4 - 6 - 9 = 81.
If a laboratory fails either accuracy or repeatability for every element, for
both types of samples, notice that its score would be 0. These monthly scores are
used in the correlation program to track laboratory performance over time. The
laboratory's 6 month average score is computed and used for certification of
the laboratory. If this 6 month average score is below 80 for three consecutive
months, the laboratory may be decertified; if the 6 month average score lies
between 80 and 90 for 3 consecutive months, the laboratory is provisionally
certified. In all other cases the laboratory continues to be certified.
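The scoring and certification rules above can be sketched in Python. The point values are those of Table 2; the function names are hypothetical, and the treatment of averages falling exactly on 80 or 90 is an assumption, since the text does not say which side of the boundary they belong to.

```python
POINTS = {"Fe": 9, "Ag": 6, "Al": 4, "Cr": 4, "Cu": 9,
          "Mg": 4, "Si": 4, "Ti": 6, "Ni": 4}


def monthly_score(failed_synthetic, failed_used):
    """Subtract each failed element's points once per sample type."""
    lost = sum(POINTS[e] for e in set(failed_synthetic))
    lost += sum(POINTS[e] for e in set(failed_used))
    return 100 - lost


def six_month_averages(scores):
    """Rolling averages over each run of six consecutive monthly scores."""
    return [sum(scores[i - 5:i + 1]) / 6 for i in range(5, len(scores))]


def certification_status(scores):
    """Classify from the three most recent 6 month averages (boundaries assumed)."""
    recent = six_month_averages(scores)[-3:]
    if len(recent) == 3 and all(avg < 80 for avg in recent):
        return "may be decertified"
    if len(recent) == 3 and all(80 <= avg < 90 for avg in recent):
        return "provisionally certified"
    return "certified"


print(monthly_score({"Fe", "Cr"}, {"Cr"}))        # laboratory 1 -> 83
print(monthly_score({"Si", "Ti"}, {"Cu"}))        # laboratory 2 -> 81
print(monthly_score(set(POINTS), set(POINTS)))    # every element, both types -> 0
```

Failing accuracy, repeatability, or both for an element costs the same number of points, which is why each element is counted once per sample type.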

As mentioned earlier, the acceptable limits for accuracy scores and
repeatability scores depend only on the trimmed means $x_T$, $y_T$ and the
appropriate constants from Table 1; they do not depend on the actual scatter
of the accuracy or repeatability scores themselves. This causes both the
accuracy and the repeatability limits, which define the acceptable values, to jump
around a great deal in terms of the number of standard deviations they
represent (away from the means, which are 0). Tables 3, 4, 5, and 6 illustrate this
phenomenon for the correlation data collected for August, 1986. These tables
summarize the number of instruments of the two types that submitted analysis
results for the various elements, as well as the computed acceptable limits for
accuracy and repeatability, for both types of samples. In addition, the actual
standard deviations of the scores have been computed. Recall that 40% of the
data was trimmed, for both the x (sample one) scores and the y (sample two)
scores, in locating the origin for the accuracy and repeatability axes. The
standard deviations used in this discussion are computed from the readings
provided by those instruments which were kept after the trimming, for one or both
of the two samples. In every case the means of these values are essentially 0,
so the standard deviations were computed about 0. The column labelled #StDev
gives the ratio of the limit (given in the column labelled Limit) for the given
variable to this computed standard deviation. Note that the number of
standard deviations which the limits represent varies quite widely from element to
element, especially for the used samples (for both types of instrument). It
also varies quite widely from month to month; Appendix A presents the same type of
data for one additional month, January, 1985, the only other month for which we
have all the necessary data available in electronic form for AE instruments.
For AA instruments, an additional 18 months of data have been available to us;
although not included with this report, they show tremendous variation in the
ratio of the limit to the standard deviation from element to element, and
from month to month for the same element. There does not seem to be any logical
reason why one would want this ratio to vary in this way. It would
seem to indicate that there is wide variability in the ease with which a
laboratory could meet the accuracy and repeatability requirements from month to month.


Table 3. August, 1986, summary of correlation scores.
Atomic Emission, Used Oils

                   Accuracy                 Repeatability         Number
Element     Limit    #StDev  #Fail     Limit    #StDev  #Fail    of Labs
Fe         3.1405    3.2027     10    1.5703    3.8917      0        183
Ag         2.1213   17.2925      0    1.0607    8.5170      0        183
Al         2.8284   52.0000      3    1.4142   26.0000      0        183
Cr         2.1215    3.2357      0    1.0607    6.4067      0        183
Cu         2.5010    3.1894     11    1.2505    3.8719      0        183
Mg         2.7776    1.9751     32    1.3888    1.9253      3        183
Si         2.8164    3.4913      5    1.4082    3.1023      6        183
Ti         2.1236    3.0166      1    1.0618    4.9184      0        183
Mo         2.1291    2.5732     10    1.0646    2.5786      1        182
Ni         2.1228    3.0808      1    1.0614    4.4233      0        183


Table 4. August, 1986, summary of correlation scores.
Atomic Emission, Synthetic Oils

                   Accuracy                 Repeatability         Number
Element     Limit    #StDev  #Fail     Limit    #StDev  #Fail    of Labs
Fe         5.5883    3.9480      4    2.7941    3.0941      0        183
Ag         3.0275    2.1700     12    1.5137    1.7183      3        183
Al         5.9450    3.7681      7    2.9725    3.2787      0        183
Cr         2.9595    2.9511      4    1.4798    3.2800      1        183
Cu         3.8884    2.4030     10    1.9442    1.4166      5        183
Mg         8.5668    2.8151     20    4.2834    2.2899      4        183
Si        15.9339    4.8487      5    7.9670    4.6366      0        183
Ti         3.8749    2.6859      9    1.9374    2.8417      1        183
Mo         2.6922    1.8139      9    1.3461    2.5420      1        182
Ni         3.2006    3.6200      2    1.6003    3.1763      0        183


Table 5. August, 1986, summary of correlation scores.
Atomic Absorption, Used Oils

                   Accuracy                 Repeatability         Number
Element     Limit    #StDev  #Fail     Limit    #StDev  #Fail    of Labs
Fe         2.9089    2.3724      6    1.4545    3.8233      0         37
Ag         2.1213   10.5623      1    1.0607    5.2812      0         37
Al         2.8292    4.0772      0    1.4146    8.4949      0         37
Cr         2.1213    3.?342      0    1.0607         ∞      0         37
Cu         2.2341    3.6256      0    1.1171    3.3077      0         37
Mg         2.2976    2.2609      7    1.1488    3.3621      0         37
Si         2.7267    2.6933     11    1.3634    8.6240      3         31
Ti         2.1216    3.5411      5    1.0608         ∞      0         32
Mo         2.1213         ∞      1    1.0607         ∞      1         14
Ni         2.1220    3.1295      5    1.0610         ∞      1         35

Note: ? marks a digit that is illegible in the source.


Table 6. August, 1986, summary of correlation scores.
Atomic Absorption, Synthetic Oils

                   Accuracy                 Repeatability         Number
Element     Limit    #StDev  #Fail     Limit    #StDev  #Fail    of Labs
Fe         5.4000    2.2351      7    2.7000    3.1609      0         37
Ag         2.9638    2.1089      8    1.4819    3.3577      1         37
Al         5.5213    2.0836      7    2.7607    2.7949      2         37
Cr         2.7500    1.6746     13    1.3750    3.1476      1         37
Cu         3.6054    3.5904      4    1.8027    3.9123      1         37
Mg         8.9353    1.9332     14    4.4676    3.7263      3         37
Si        15.1153    2.3767     11    7.5577    2.9521      0         31
Ti         3.8108    1.9218     12    1.9054    2.6810      0         32
Mo         2.7003    1.6592      4    1.3501    3.0514      1         14
Ni         3.0897    2.2816      9    1.5448    3.0922      2         35

The instruments read out concentration values to the nearest .1; these
values are reported to the TSC. The data entered into the computer for the
correlation program computations are rounded to the closest integer, causing a
large number of pairs of analyses to be identical. This phenomenon in turn can
lead to all of the nontrimmed accuracy scores (or equally well the repeatability
scores), which are used to compute the standard deviation, being equal; such a
standard deviation then is 0 and the ratio of the computed limit to such a
standard deviation is of course undefined. This situation is labelled by the symbol
∞ in the #StDev column (see, e.g., Table 5, Cr, repeatability). If the data
were entered with full accuracy (including tenths), it is expected that this
phenomenon would not occur very frequently.
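The #StDev ratio, including the degenerate case, can be sketched as follows. The function name is hypothetical, and the divisor n used for the standard deviation about 0 is an assumption, since the report does not state whether n or n - 1 was used.

```python
import math


def n_stdev(limit, retained_scores):
    """Ratio of the limit to the standard deviation of the retained scores about 0."""
    # Assumption: the mean is taken to be 0, so the divisor is n.
    s = math.sqrt(sum(v * v for v in retained_scores) / len(retained_scores))
    return math.inf if s == 0 else limit / s


# All retained scores equal (as can happen after rounding to integers):
print(n_stdev(1.0607, [0.0, 0.0, 0.0, 0.0]))   # reported as the symbol for infinity
# A case with some spread:
print(n_stdev(2.0, [1.0, -1.0, 1.0, -1.0]))
```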


SUGGESTED IMPROVEMENTS


The paper by Youden, discussed earlier, led to the publication of a
number of additional contributions to the literature about interlaboratory
comparisons. An interesting paper [2] was published by Mandel and Lashof,
giving interpretations and more mathematical discussions of Youden's ideas, one
year after [1] discussed the JOAP correlation program as now implemented. Among
other things, Mandel and Lashof give some models which make the scatter of points
mentioned by Youden seem natural, as well as changes to these models which could
reasonably lead to quite different looking plots. They suggest that the
bivariate normal distribution provides a good model for the original pairs of sample
readings $(x_i, y_i)$; if one lets $\bar{x}$, $\bar{y}$ represent the means of the
observed pairs, then the pairs $(x_i - \bar{x}, y_i - \bar{y})$ will be bivariate
random variables with means equal to zero. One can then use principal components
to find the direction of the axis which includes the greatest variability; for
some simple, reasonable types of structures this direction turns out to be the
line with slope 1, the phenomenon pointed out by Youden. The orthogonal direction
is the one with the least variability, and is free of effects of different
instruments under a standard type of linear model; for the types of samples used
in the JOAP correlation program, it would appear that the simple type of linear
model they discuss should be appropriate. The following discussion incorporates
some of the ideas and suggestions made by Mandel and Lashof.

For a given month, for a given element and type of instrument, let
$(x_i, y_i)$ represent the observed pairs of analyses received, $i = 1, 2, \ldots, n$,
where n is the number of instruments. Let us assume that

$x_i = \mu_1 + L_i + e_i$,

$y_i = \mu_2 + L_i + f_i$,

for $i = 1, 2, \ldots, n$. Here $\mu_1$ and $\mu_2$ represent the "true" contents of
samples 1 and 2, respectively; $L_i$ is meant to represent a laboratory effect that
is constant for both $x_i$ and $y_i$, the two sample readings received from the same
laboratory; $e_i$ and $f_i$ represent independent random measurement errors (or
noise) for the two analyses. It seems quite reasonable to assume that the $e_i$ and
$f_i$ values are independent and normally distributed with the same variance; as
Mandel and Lashof suggest, one can also assume that the laboratory effects, $L_i$,
are normally distributed. It follows then that

$X_i = x_i - \bar{x} = L_i - \bar{L} + e_i - \bar{e}$,

$Y_i = y_i - \bar{y} = L_i - \bar{L} + f_i - \bar{f}$;

that is, the pairs $(X_i, Y_i)$ do not depend on the true contents $\mu_1$, $\mu_2$,
but only on the laboratory effects and the measurement errors. The projection of
$(X_i, Y_i)$ onto the 45° line, times the square root of two, then is

$X_i + Y_i = 2(L_i - \bar{L}) + (e_i - \bar{e}) + (f_i - \bar{f})$.

Note that this sum is affected by the laboratory effects as well as the
measurement errors. The projection of $(X_i, Y_i)$ onto the line normal to the 45°
line, times the square root of two, is

$X_i - Y_i = (e_i - \bar{e}) - (f_i - \bar{f})$,

a quantity which depends only on the measurement errors, and not on the laboratory
effects (or the true contents). With this simple type of additive model it
indeed turns out that the projections on the 45° line give a reasonable idea of
accuracy, or spread, among the different laboratories, and the projections on the
orthogonal axis depend only on the measurement errors, or repeatability, of an
instrument's readings.
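The algebra of the additive model can be checked numerically with a small simulation. This sketch only illustrates the identities above; the sample size, true contents, and standard deviations are hypothetical values chosen for the example.

```python
import random


def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def simulate(n=50, mu1=20.0, mu2=22.0, sd_lab=2.0, sd_err=0.5, seed=1):
    """Draw x_i = mu1 + L_i + e_i and y_i = mu2 + L_i + f_i with normal terms."""
    rng = random.Random(seed)
    lab = [rng.gauss(0.0, sd_lab) for _ in range(n)]
    e = [rng.gauss(0.0, sd_err) for _ in range(n)]
    f = [rng.gauss(0.0, sd_err) for _ in range(n)]
    x = [mu1 + lab[i] + e[i] for i in range(n)]
    y = [mu2 + lab[i] + f[i] for i in range(n)]
    return lab, e, f, x, y


lab, e, f, x, y = simulate()
X, Y = centered(x), centered(y)
Lc, ec, fc = centered(lab), centered(e), centered(f)

# X_i + Y_i should equal 2(L_i - Lbar) + (e_i - ebar) + (f_i - fbar).
sum_gap = max(abs((X[i] + Y[i]) - (2 * Lc[i] + ec[i] + fc[i])) for i in range(len(X)))
# X_i - Y_i should equal (e_i - ebar) - (f_i - fbar): no laboratory effect.
diff_gap = max(abs((X[i] - Y[i]) - (ec[i] - fc[i])) for i in range(len(X)))
print(sum_gap, diff_gap)
```

Both gaps come out at the level of floating-point rounding, whatever values are chosen for the true contents, which is the sense in which the difference is free of both laboratory effects and sample concentrations.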

It is not uncommon for "wild" points to occur in using sensitive
instruments to make fine measurements; undoubtedly the reason for trimming the data in
the correlation program is to remove these effects. While we agree with this
general principle (using trimming to remove outliers), it also seems that 40%
trimming is very extreme. The idea of independent trimming of the two samples
is also not particularly appealing, allowing, as already mentioned, the
possibility that an instrument's y-score is trimmed but its x-score is not. We have
made a preliminary investigation into trimming, covering possible methods for
doing it as well as the amount of trimming to apply. It was pointed out some time
ago that the untrimmed means and the trimmed means, using actual correlation data,
result in essentially the same value for all elements for all months. That is,
because of the number of laboratories participating, and the apparent fact that
"wild" points seem to occur symmetrically (some big, some small), the location is
essentially unchanged if one uses untrimmed means instead of trimmed means.
Since the current correlation program computations depend only on the trimmed
means (and constants), one would get the same scores and results if one used
the full set of raw data with no trimming. If, however, one wants to use the
observed scatter or spread on the accuracy and repeatability axes to determine
limits for acceptable behavior, "wild" points could seriously inflate the
results; thus we are in favor of applying some trimming before establishing
limits for accuracy and repeatability. We are also in favor of the philosophy
of bivariate trimming: if a laboratory is trimmed on the x-scale it is also
necessarily trimmed on the y-scale.

There are many different ways to implement bivariate trimming. If one
adopts the suggestions of normality put forth by Mandel and Lashof, it would
seem natural to use the constant contours of the bivariate normal density
function to accomplish the trimming. Letting $(X_i, Y_i)$ be as defined above, this
means computing and inverting a 2 by 2 matrix (details are given in Appendix B), and
then evaluating quadratic forms (locating the contours which contain the observed
points). Those points most distant from the origin are the candidates for
trimming. Figure 3 presents a typical scatter of observed sample results, with 3
of the bivariate normal constant contours drawn in. We have applied this type
of trimming to the August, 1986, and January, 1985, data available to us; 20%,
10% and 5% trimming were looked at, and for these data it appears that 5% contour
trimming does a sufficiently good job for both types of instrument. Thus for the
20 points pictured in Figure 3, 5% trimming would delete the single point on the
outermost ellipse.
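One way to carry out contour trimming is sketched below. Appendix B is not reproduced here, so the details are assumptions: the 2 by 2 matrix is taken to be the sample covariance matrix of the pairs (divisor n - 1), the quadratic form is the squared Mahalanobis distance from the mean, the trimming is done in a single pass, and the number of points trimmed is the trimming fraction times n, rounded. The function names are hypothetical.

```python
def contour_distances(xs, ys):
    """Quadratic form (squared Mahalanobis distance) of each pair about the mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy * sxy
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    return [
        ((x - mx) ** 2 * syy - 2 * (x - mx) * (y - my) * sxy + (y - my) ** 2 * sxx) / det
        for x, y in zip(xs, ys)
    ]


def contour_trim(xs, ys, fraction=0.05):
    """Return the indices kept after dropping the most distant pairs."""
    d2 = contour_distances(xs, ys)
    n_trim = round(fraction * len(xs))
    by_distance = sorted(range(len(xs)), key=lambda i: d2[i])
    return sorted(by_distance[:len(xs) - n_trim])


# Hypothetical data: 19 pairs near the 45 degree line plus one wild pair (index 19).
xs = [float(i) for i in range(19)] + [9.0]
ys = [i + (0.5 if i % 2 else -0.5) for i in range(19)] + [30.0]
kept = contour_trim(xs, ys, fraction=0.05)
print(len(kept), 19 in kept)
```

Because a pair is either kept or dropped as a whole, a laboratory trimmed on the x-scale is necessarily trimmed on the y-scale as well.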

We recommend that for a period of time (6 months or more) the scores for
the correlation program be computed as at present and compared with scores
developed essentially according to Youden's original suggested procedure for
interlaboratory comparisons. Specifically, this second set of scores will employ
5% bivariate trimming (as mentioned above and defined in Appendix B) to guard
against "wild" points. The trimmed means $\bar{x}$, $\bar{y}$ are computed from the
remaining observations and used to define $X_i = x_i - \bar{x}$ and
$Y_i = y_i - \bar{y}$, centering at these trimmed means. Then all observed pairs
$(X_i, Y_i)$ will be projected onto the 45° line (accuracy scores) and onto the line
normal to this line (repeatability scores). The original untrimmed values are then
used to compute standard deviations in each of these two dimensions (see
Appendix B), which in turn are used to define the acceptable accuracy and
repeatability limits. Any laboratory which has an accuracy score whose magnitude
exceeds the accuracy limit fails on accuracy; any laboratory which has a
repeatability score whose magnitude exceeds the repeatability limit fails on
repeatability.


The number of standard deviations to use in defining the accuracy and
repeatability limits is, of course, arbitrary, and can be set at any level
desired. We recommend that 3 standard deviations be used, at least initially;
thus the accuracy limit will be $3s_a$ and the repeatability limit will be $3s_r$,
where $s_a$ and $s_r$ are the computed standard deviations. A rationale for using 3
as a multiplier is given below. This procedure has been applied to both the
August, 1986, and the January, 1985, data; Tables 7, 8, 9, and 10 present the
resulting limits for the August, 1986, data and are comparable to Tables 3
through 6 presented earlier. The computations for January, 1985, are also
presented in Appendix A.

In comparing the correlation scores with these proposed scores, perhaps
the most apparent difference is the increased number of laboratories which fail
on repeatability (and the relatively smaller number which fail on accuracy). It
would appear that the correlation program method of determining the limits for
repeatability (which depend only on the 60% trimmed means) does not provide an
effective check. Note as well that the AA instruments in general fare much
better with the proposed method than they do with the current correlation
procedure. As mentioned above, the procedure for determining the limits in the
correlation program includes the constants A and B, which were undoubtedly
derived for an AE instrument and do not perform well for an AA instrument. It
is also of interest to compare which particular instruments fail on accuracy
and/or repeatability for the correlation program versus the proposed procedure.
Appendix A presents this information for the August 1986 and January 1985 data.

One rationale for determining the number of standard deviations to use
in setting the accuracy and repeatability limits can be defined in terms of the
chances of an instrument, which performs correctly, passing both the accuracy
and repeatability tests, for all elements, for both used and synthetic samples.
Let p represent the probability that a correctly functioning instrument will
fail the check, for either accuracy or repeatability (the same value for both).
Then the probability that it will pass both accuracy and repeatability, for any
element, is $(1-p)^2$ and the probability it will pass for all nine elements for
both used and synthetic samples (assuming independence) is $(1-p)^{36}$. Suppose
we set this quantity equal to .9; this gives the value for $1 - p$ to be
$.9^{1/36}$, or

Table 7. Proposed method for determining scores.

Proposed Scores          Atomic Emission
Used Oils                August 1986

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   1.2649    3.7947      4      .3772    1.1316      2         183
Ag    .4575    1.3725      0      .0000     .0000      5         183
Al    .3116     .9347     14      .0000     .0000     15         183
Cr    .6408    1.9223      0      .1054     .3162     10         183
Cu   1.1871    3.5613      6      .2963     .8888      0         183
Mg   1.8946    5.6838      8      .3643    1.0928      3         183
Si    .9489    2.8467      5      .3752    1.1257      8         183
Ti    .7609    2.2826      1      .1571     .4712     18         183
Mo   1.0697    3.2091      0      .3541    1.0623      1         182
Ni    .6876    2.0629      2      .2243     .6729     22         183

Table 8. Proposed method for determining scores.

Proposed Scores          Atomic Emission
Synthetic Oils           August 1986

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   1.9422    5.8265      2      .7871    2.3614      2         183
Ag   1.4674    4.4022      3      .3993    1.1978      6         183
Al   2.2865    6.8594      7      .8169    2.4506      1         183
Cr   1.0171    3.0512      3      .4124    1.2371      2         183
Cu   1.6348    4.9044      5      .5399    1.6197      5         183
Mg   4.4506   13.3518      2     1.6082    4.8246      2         183
Si   4.8085   14.4255      5     1.5389    4.6168      2         183
Ti   1.6961    5.0882      4      .5924    1.7772      1         183
Mo   1.4966    4.4898      1      .4855    1.4564      1         182
Ni   1.0884    3.2651      2      .4284    1.2851      5         183


Table 9. Proposed method for determining scores.

Proposed Scores          Atomic Absorption
Used Oils                August 1986

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   2.1078    6.3234      0      .3357    1.0071      0          37
Ag    .4452    1.3356      1      .2226     .6678      0          37
Al    .8396    2.5188      0      .1621     .4863      2          37
Cr    .5902    1.7706      0      .0000     .0000      0          37
Cu    .7071    2.1213      0      .3285     .9856      0          37
Mg   2.5244    7.5731      0      .4008    1.2025      0          37
Si   3.1192    9.3576      1      .4348    1.3045      3          33
Ti   2.6074    7.8221      1      .1739     .5217      2          32
Mo    .9449    2.8347      1      .5669    1.7008      1          14
Ni   1.1114    3.3343      2      .0000     .0000      1          35

Table 10. Proposed method for determining scores.

Proposed Scores          Atomic Absorption
Synthetic Oils           August 1986

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   4.1044   12.3132      1      .8962    2.6886      0          37
Ag   2.5229    7.5687      1      .4309    1.2928      1
Al   5.7968   17.3905      2      .9968    2.9905      0          37
Cr   3.0909    9.2727      2      .5297    1.5890      1          37
Cu   2.0240    6.0719      2      .6409    1.9226      1
Mg  19.1620   57.4860      2     1.8851    5.6553      2
Si  35.6186  106.8559      0     2.3484    7.0452      1          33
Ti   5.8034   17.4102      2      .7199    2.1597      0          3?
Mo   3.3033    9.9100      0     3.6025   10.8075      1          14
Ni   3.9572   11.8716      2      .7932    2.3797      2          35

.9971. With the assumption of normality for the measurement errors for the
instruments, and for the variation between instruments, we then require the
number of standard deviations that a normal random variable will exceed with
probability .00145 (half the value of 1 - .9971 = .0029, since the projected
scores can be extreme either positively or negatively). This, in turn, results
in a requirement of 2.978 standard deviations, which we have rounded to 3 for a
trial of the proposed system.
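
The arithmetic behind the multiplier can be checked with a short sketch. It
assumes, as the text does, 36 independent checks and a target overall pass
probability of .9; the function name sd_multiplier and its parameters are
illustrative only and do not come from the report.

```python
from statistics import NormalDist


def sd_multiplier(overall_pass=0.9, n_checks=36):
    """Return (per-check pass probability, per-check failure probability p,
    two-sided normal multiplier) for a correctly functioning instrument."""
    # (1 - p) ** n_checks = overall_pass  =>  1 - p = overall_pass ** (1 / n_checks)
    per_check_pass = overall_pass ** (1.0 / n_checks)
    p = 1.0 - per_check_pass
    # A score can be extreme in either direction, so each tail gets p / 2.
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return per_check_pass, p, z


per_check_pass, p, z = sd_multiplier()
print(round(per_check_pass, 4), round(p, 4), round(z, 2))
```

With the unrounded tail probability the multiplier comes out close to the 2.978
quoted above; the small difference arises from rounding the tail probability to
.00145 before looking up the normal quantile.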


BIBLIOGRAPHY 

[1]  Clark, W. E., Jr., "A Procedure for Interlaboratory Correlation of
     Fluid Analysis Spectrometers", U. S. Naval Weapon Systems Analysis
     Office, Quantico, Va., 22134, VSAO-TM-731, 1973.

[2]  Mandel, John, and Lashof, T. W., "Interpretation and Generalization of
     Youden's Two-Sample Diagram", Journal of Quality Technology, Vol. 6,
     No. 1, 1974.

[3]  Youden, W. J., "Graphical Diagnosis of Interlaboratory Test Results",
     Industrial Quality Control, Vol. 15, No. 11, 1959.


APPENDIX  A 


This  appendix  presents  additional  numerical  tables;  Tables  A1  through 
A8  present  the  correlation  program  data  and  the  proposed  program  data  for  the 
month  of  January,  1985;  the  format  and  information  presented  is  identical  to 
that  given  earlier  in  Tables  3  through  10,  for  August  1986. 


Table A1. January, 1985, summary of correlation scores.

Correlation Scores       Atomic Emission
Used Oils                January 1985

           Accuracy                Repeatability              Number
      Limit    #StDev  #Fail      Limit    #StDev  #Fail     of Labs


2.1260 

2.1306 

2.1243 


9.9737 

2.6244 

3.0390 


1.0630 

1.0653 

1.0622 


4.9869 

1.9102 

3.3791 


Table A2. January, 1985, summary of correlation scores.

Correlation Scores       Atomic Emission
Synthetic Oils           January 1985

           Accuracy                Repeatability              Number
      Limit    #StDev  #Fail      Limit    #StDev  #Fail     of Labs
Fe  10.1637    3.9005      6     5.0819    3.0955
Ag   2.6760    2.6029     19     1.3380    3.1013
Al   3.6902    2.4123     12     1.8451    2.2952
Cr   2.8611    3.5102      2     1.4305    2.4633
Cu   2.8887    2.9225      5     1.4444    3.0815
Mg   4.8431    3.0231     16     2.4215    2.3116
Si  26.1873    4.1619      8    13.0937    3.8006
Ti   3.0830    3.0546     19     1.5415    2.5627
Mo   2.1234    3.0510      2     1.0617    2.4013
Ni   2.5477    3.6203      2     1.2739    2.8681


Table A3. January, 1985, summary of correlation scores.

Correlation Scores       Atomic Absorption
Used Oils                January 1985

           Accuracy                Repeatability              Number
      Limit    #StDev  #Fail      Limit    #StDev  #Fail     of Labs


2.1260 

2.8149 

2.1299 

2.7107 

2.1214 

2.1213 

2.1216 


16.4682 

2.5119 

2.9297 

1.7149 

3.9403 

• 

3.1765 


.5194 

.0607 

.4510 

1.0630 

1.4074 

1.0650 

1.3554 

1.0607 

1.0607 

1.0608 


3.9041 

8.7464 

3.6814 

8.2341 

2.6220 

3.6705 

2.2365 


8.7475 


Table A4. January, 1985, summary of correlation scores.

Correlation Scores       Atomic Absorption
Synthetic Oils           January 1985

           Accuracy                Repeatability              Number
      Limit    #StDev  #Fail      Limit    #StDev  #Fail     of Labs


Table A5. January, 1985, summary of proposed scores.

Proposed Scores          Atomic Emission
Used Oils                January 1985

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   1.4253    4.2760      6      .4200    1.2600      6         180
Ag    .5860    1.7579      2      .0000     .0000      4         180
Al   2.0972    6.2915      4      .4005    1.2014      5         180
Cr    .7603    2.2810      1      .1646     .4938     14         180
Cu   2.7733    8.3198      4      .5965    1.7896      4         180
Mg    .9294    2.7882      5      .2163     .6489     22         179
Si    .8833    2.6500      6      .3574    1.0723      2         180
Ti    .7462    2.2386      3      .1405     .4216     14         180
Mo   1.1451    3.4352      0      .4113    1.2339      7         176
Ni    .7389    2.2168      3      .2340     .7021     12         180

Table A6. January, 1985, summary of proposed scores.

Proposed Scores          Atomic Emission
Synthetic Oils           January 1985

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   3.8243   11.4729      5     1.5377    4.6130      5         180
Ag   1.2390    3.7169      3      .3786    1.1358      2         180
Al   1.8333    5.4999      2      .6537    1.9612      7         180
Cr    .9654    2.8963      2      .5081    1.5242      4         180
Cu   1.1571    3.4712      1      .4341    1.3022      0         180
Mg   2.3905    7.1716      6      .9018    2.7054      3         179
Si   9.0119   27.0357      7     3.1042    9.3125      6         180
Ti   1.3609    4.0826      6      .5680    1.7039      1         180
Mo    .7933    2.3798      2      .3221     .9662      5         176
Ni    .8372    2.5117      2      .3957    1.1870      4         180

Table A7. January, 1985, summary of proposed scores.

Proposed Scores          Atomic Absorption
Used Oils                January 1985

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   1.6809    5.0427      2      .3023     .9069      1          40
Ag    .4845    1.4534      0      .0000     .0000      1          39
Al   1.6989    5.0966      0      .3288     .9864      1          39
Cr    .7076    2.1228      2      .1600     .4800      2          39
Cu   1.1350    3.4049      0      .2786     .8357      1          40
Mg    .8625    2.5876      1      .2822     .8467      0          39
Si   6.0414   18.1243      2     1.0756    3.2268      1          23
Ti   1.0539    3.1618      2      .0000     .0000      3          27
Mo    .6761    2.0284      2      .0000     .0000      2          22
Ni    .6770    2.0310      0      .0000     .0000      1          34

Table A8. January, 1985, summary of proposed scores.

Proposed Scores          Atomic Absorption
Synthetic Oils           January 1985

           Accuracy                Repeatability              Number
      StDev     Limit  #Fail      StDev     Limit  #Fail     of Labs
Fe   8.4609   25.3826      3     2.6683    8.0049      0          40
Ag   2.3520    7.0561      1      .5102    1.5306      0          39
Al   2.3069    6.9207      2      .4255    1.2764      1          39
Cr   2.7603    8.2808      0      .6728    2.0185      0          39
Cu   1.1571    3.4713      0      .5078    1.5233      2          40
Mg   5.4262   16.2786      2     1.1346    3.4038      0          39
Si  18.5882   55.7646      2     5.1979   15.5938      0          23
Ti   5.0776   15.2329      1      .8288    2.4864      0          27
Mo    .6761    2.0284      1      .0000     .0000      1          22
Ni   2.2082    6.6245      1      .5303    1.5910      1          34

Tables A9 through A24 present the indices of those laboratories which
failed accuracy and/or repeatability, for the current correlation program and
for the proposed method of scoring.


Table  A9.  Indices  of  failing  labs 
Correlation  Scores  Atomic  Emission 

Used  Oils  August  1986 


Fe 


Acc  Rep 
9 
33 
39 
55 
96 
152 
172 
177 
181 
182 


Ag 


Acc  Rep 


Al 


Acc  Rep 
11 
49 
151 


Cr 


Acc  Rep


Cu 


Acc  Rep 
49 


55 

118 

146 

147 

148 
151 
157 
164 
181 
182 


Mg 


Acc 

1 

3 

9 

13 

15 

16 
24 
33 

39 

40 
55 
79 
88 
90 

95 

96 
98 

112 

117 

127 

131 

147 

148 
152 

169 

170 

171 

172 
175 
177 
181 
182 


Rep

24 

181 

182 


Si 


Acc  Rep 
9  6 

112  33 
134  59 
155  79 
165  112 
165 


Ti 


Acc  Rep 
182 


Mo


Acc  Rep 
9  12 
11 
15 
41 
49 
78 
102 
118 
166 
180 


Ni 


Acc  Rep 
146 


Table  A10.  Indices  of  Failing  Labs 
Proposed  Scores  Atomic  Emission 

Used  Oils  August  1986 


Fe 

Ag

Al 

Cr 

Cu 

Mg 

Si 

Ti 

Mo 

Ni 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

152  122 

2 

1 

1 

33 

55 

9  24 

9 

6 

182  42 

12 

146  35 

177  181 

3 

7 

7 

41 

118 

15  181 

112 

33 

67 

177  36 

181 

113 

8 

8 

49 

146 

16  182 

134 

59 

71 

37 

182 

121 

11 

11 

68 

148 

24 

155 

79 

78 

42 

173 

31 

31 

75 

181 

33 

165 

103 

82 

59 

41 

36 

124 

182 

148 

112 

88 

67 

44 

41 

140 

181 

165 

89 

71 

49 

44 

167 

182 

182 

92 

77 

54 

49 

174 

98 

85 

151 

54 

182 

130 

90 

161 

151 

134 

98 

162 

161 

140 

101 

173 

162 

151 

106 

181 

173 

153 

122 

181 

167 

130 

176 

134 

177 

143 

182 

146 

156 

163 

167 

174 

Table A13. Indices of Failing Labs
Correlation  Scores  Atomic  Absorption 

Used  Oils  August  1986 


Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

Acc  Rep 

14 

3  8 

9 

14  14 

12  12 

15 

8  11 

10 

19 

16 

9  24 

12 

25 

17 

11 

24 

26 

24 

12 

26 

31 

26 

18 

33 

22 

23 

24 

28 

31 

Table  A14.  Indices  of  failing  Labs 
Proposed  Scores  Atomic  Absorption 

Used  Oils  August  1986 


Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep 

12  -  27  11  8  26  9  14  14  12  12 

31  11  26  26 

24 


Acc  Repl  Acc  Rep  Acc  Rep  Acc  Repl  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Repl  Acc  Repl  Acc  Rep 


7  28 1  7  7  4  31  8  12  7  7  7 


Acc  Rep  Ace  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep 

24  28  28  12  17  31  12  12  12  7  20  12 

28  24  18  37  34  26 
Table A17. Indices of Failing Labs
Correlation Scores  Atomic Emission

Used Oils  January 1985


Table  A18.  Indices  of  failing  Labs 
Proposed  Scores  Atomic  Emission 

Used  Oils  January  1985 
Table A19. Indices of Failing Labs
Correlation  Scores  Atomic  Emission 

Synthetic  Oils  January  1985 


Acc  Rep I  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep 

a  am  pa  I  /  an  «r/  *»/  *»  enn  v»  n  ct  94/  an  n 


108  50 
128  102 

170  108 

171 


10  71  154  74 
25  108  157  108 


41  110 
76  124 

101  158 

102  174 
138  180 
143 

161 

170 

178 

180 


74  23 

108  118 
119  154 


2  102 
12  103 
31  108 
48  138 
54 


12  51  114 

31  102  160 
38  108 
71  120 
77 
79 
103 
108 
137 

154 

155 
159 
161 
164 
170 

172 

173 
175 


Table  A20.  Indices  of  Failing  Labs 
Proposed Scores  Atomic Emission

Synthetic  Oils  January  1985 


Acc  Rep 
108  50 
128  67 

170  102 

171  108 
178  136 


Acc  Rep 
31  37 


Acc  Rep 
138  71 


Acc  Rep 
154  74 


Acc  Rep 
23 


151  176 
176 


161  108 
110 


157  108 
119 


Acc  Rep 
31  102 
108  103 
115  108 


Acc  Rep 
29  29 


Acc  Rep  Acc  Rep 
79  108  114  80 


Acc  Rep 
12  17 


108  136 
170  173 


Table  A21.  Indices  of  Failing  Labs 
Correlation  Scores  Atomic  Absorption 

Used  Oils  January  1985 


Acc  Rep  Acc  Rep 
19 
28 
29 
37 


Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep 

19  36  9  33  20  9  9  7  7 

*0  in  1/.  in  on 


Table A22. Indices of Failing Labs
Proposed  Scores  Atomic  Absorption 

Used  Oils  January  1985 


Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep  Acc  Rep 

28  33  29  26  11  3  33  39  13  14  7  7  7  7  24 

t?  4  4  4/  17  4ft  *Sft  On 


APPENDIX  B 


This appendix describes the mathematical computations used for the proposed
scoring system. The same procedure is followed for each element, for each
type of sample, for either type of instrument. As in the text above, let

    $x_i$ = Analysis result for sample 1, laboratory i
    $y_i$ = Analysis result for sample 2, laboratory i.

The first step is to perform the bivariate trimming. Define

    $\bar{x} = \sum x_i / n$ = Average of all sample 1 results

    $\bar{y} = \sum y_i / n$ = Average of all sample 2 results

where n is the total number of instruments analyzing this sample. Now define

    $A = \sum (x_i - \bar{x})^2 / (n - 1)$

    $C = \sum (y_i - \bar{y})^2 / (n - 1)$

    $B = \sum (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$

and let S be the 2 by 2 matrix whose first row is A, B and whose second row is
B, C; define T to be the matrix inverse of S. This matrix T is used to evaluate
n quadratic forms, one for each participating laboratory. That is, for
instrument i the quadratic form is

    $Q_i = t_{11}(x_i - \bar{x})^2 + 2t_{12}(x_i - \bar{x})(y_i - \bar{y}) + t_{22}(y_i - \bar{y})^2$

where the first row of T is $t_{11}$, $t_{12}$ and the second row of T is
$t_{12}$, $t_{22}$.

These values $Q_i$ then are ranked in order of magnitude, from smallest to
largest, and are used to trim off (no more than) 5% of the instruments; if for
example n = 183 instruments had analyzed the sample, 5% of n equals 9.15 so the
9 largest values identify those instruments to be trimmed off. The remaining
174 laboratories are used to determine the accuracy and repeatability
limits. Let m represent the number of instruments remaining after trimming;
m, of course, is the next larger integer above .95n (or .95n rounded up).
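
The trimming step described above can be sketched as follows. This is a
minimal illustration using only the definitions of A, B, C, T and $Q_i$ given
in this appendix; the function names trim_count and bivariate_trim, and the
choice to raise an error when S is singular, are assumptions made for the
sketch and are not specified in the report.

```python
def trim_count(n, percent=5):
    """Number of instruments to trim: no more than `percent` % of n."""
    return n * percent // 100


def bivariate_trim(pairs, percent=5):
    """Split instrument indices into (kept, trimmed) using the quadratic form Q_i.

    `pairs` is a list of (x_i, y_i) results, one pair per instrument.
    """
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n

    # Sample variances (A, C) and covariance (B), each with divisor n - 1.
    a = sum((x - xbar) ** 2 for x, _ in pairs) / (n - 1)
    c = sum((y - ybar) ** 2 for _, y in pairs) / (n - 1)
    b = sum((x - xbar) * (y - ybar) for x, y in pairs) / (n - 1)

    det = a * c - b * b
    if det <= 0.0:
        # Assumption: the report does not say how a singular S is handled.
        raise ValueError("matrix S is singular; T cannot be formed")

    # T is the inverse of S = [[A, B], [B, C]].
    t11, t12, t22 = c / det, -b / det, a / det

    q = [
        t11 * (x - xbar) ** 2
        + 2.0 * t12 * (x - xbar) * (y - ybar)
        + t22 * (y - ybar) ** 2
        for x, y in pairs
    ]

    k = trim_count(n, percent)
    order = sorted(range(n), key=lambda i: q[i])  # smallest Q first
    kept = sorted(order[: n - k])
    trimmed = sorted(order[n - k:])  # the k largest Q values
    return kept, trimmed
```

For n = 183 this trims 9 instruments and keeps m = 174, matching the example in
the text; for n = 20 it trims a single instrument.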


The trimmed means, for the x and y scores, are the averages of the
m remaining pairs. Hopefully without confusion, let $\bar{x}$ and $\bar{y}$
represent these trimmed means and define

    $X_i = x_i - \bar{x}$

    $Y_i = y_i - \bar{y}$

for all n instruments. The accuracy score for instrument i then is

    $A_i = (X_i + Y_i)/2^{.5}$

and the repeatability score for instrument i is

    $R_i = (X_i - Y_i)/2^{.5}$.

We now have n pairs, $(A_i, R_i)$, one for each instrument. Using the
m instrument pairs which were not trimmed initially, define the accuracy
and repeatability standard deviations by

    $s_a = (\sum A_i^2/(m - 1))^{.5}$

    $s_r = (\sum R_i^2/(m - 1))^{.5}$.

The accuracy limit is $3s_a$ and the repeatability limit is $3s_r$; any
instrument, trimmed or not, whose accuracy score or repeatability score exceeds
the respective limit in magnitude (absolute value) fails on that score.
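
The scoring step can be sketched in the same way. The sketch below takes the
indices of the untrimmed instruments as given (however they were obtained) and
follows the definitions of the scores, standard deviations and limits in this
appendix; the function name proposed_scores, its argument names and the
returned dictionary layout are illustrative assumptions, and the sign
convention $X_i - Y_i$ for repeatability is assumed (only the magnitude is used
in the test).

```python
import math


def proposed_scores(pairs, kept, multiplier=3.0):
    """Accuracy and repeatability scores, limits and failing instruments.

    `pairs` holds (x_i, y_i) for all n instruments; `kept` holds the indices
    of the m instruments that survived trimming.
    """
    m = len(kept)
    # Trimmed means: averages over the m untrimmed pairs only.
    xbar = sum(pairs[i][0] for i in kept) / m
    ybar = sum(pairs[i][1] for i in kept) / m

    root2 = math.sqrt(2.0)
    # Scores are computed for all n instruments, trimmed or not.
    acc = [((x - xbar) + (y - ybar)) / root2 for x, y in pairs]
    rep = [((x - xbar) - (y - ybar)) / root2 for x, y in pairs]

    # Standard deviations use only the untrimmed instruments.
    s_a = math.sqrt(sum(acc[i] ** 2 for i in kept) / (m - 1))
    s_r = math.sqrt(sum(rep[i] ** 2 for i in kept) / (m - 1))

    acc_limit = multiplier * s_a
    rep_limit = multiplier * s_r
    return {
        "s_a": s_a,
        "s_r": s_r,
        "acc_limit": acc_limit,
        "rep_limit": rep_limit,
        "fails_acc": [i for i, v in enumerate(acc) if abs(v) > acc_limit],
        "fails_rep": [i for i, v in enumerate(rep) if abs(v) > rep_limit],
    }
```

As a usage example, with ten instruments of which the last was trimmed, the
call proposed_scores(pairs, kept=list(range(9))) still scores the trimmed
instrument against the limits, which is how a "wild" laboratory ends up
reported as failing.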